A PARADIGM FOR REASONING BY ANALOGY

Robert E. Kling
Stanford Research Institute
Menlo Park, California
U.S.A.

ABSTRACT 

A paradigm is outlined that enables heuristic problem-solving
programs to exploit an analogy between a current unsolved
problem and a similar but previously solved problem to
simplify the search for a solution. It is developed in detail
for a first-order resolution logic theorem prover.
Descriptions of the paradigm, implemented LISP programs, and
preliminary experimental results are presented. This is
believed to be the first system that develops analogical
information and exploits it so that a problem-solving program
can speed its search.

INTRODUCTION 

An intelligent man thinks deeply and learns from his past
experiences. Contemporary theorem-proving and problem-solving
systems are continually designed to think ever more deeply
and to ignore their past completely. A problem solver
designed in any of the contemporary paradigms (such as
resolution (1), GPS (2), and REF-ARF (3)) solves the same
problem the same way each time it is presented. A fortiori,
such systems are unable to exploit similarities between new
and old problems to hasten the search for a solution to the
new one. ZORBA, outlined in this paper, is a paradigm for
handling some kinds of analogies.* This is the first instance
of a system that derives the analogical relationship between
two problems and outputs the kind of information that can be
usefully employed by a problem-solving system to expedite its
search. As such, ZORBA is valuable in three ways:

(1) It shows how nontrivial analogical reasoning (AR) can be
    performed with the technical devices familiar to
    heuristic programmers, e.g., tree search, matching, and
    pruning.

(2) It provides a concrete information-processing framework
    within which and against which one can pose and answer
    questions germane to AR.

(3) Since it is implemented (in LISP), it is available as a
    research tool as well as a gedanken tool.

* In Ref. (4), I show that there are several kinds of
  analogies from an information-processing point of view. We
  should hardly expect one paradigm to include them all.
  Restrictions on the varieties of analogy handled by ZORBA
  are described in the section entitled "Necessary Conditions
  for an Analogy."

The last two contributions are by far the most important,
although our attention will focus upon the first. In the 50's
and 60's, many researchers felt that analogical reasoning
would be an important addition to intelligent problem-solving
programs. However, no substantial proposals were offered, and
the idea of AR remained rather nebulous, merely a hope. ZORBA
may raise more questions of the "what if?" variety than it
answers. However, now, unlike 1968, we have an elementary
framework for making these questions and their answers
operational.

ZORBA  PARADIGM 

Although prior to ZORBA there were no concrete paradigms for
AR, there was an unarticulated, undeveloped paradigm within
the artificial intelligence Zeitgeist. Suppose a problem
solver had solved some problem P and has its solution S. If a
program is to solve a new, analogous P_A, it should do the
following:

(1) Examine S and construct some plan (schema) S' that could
    be used to generate S.

(2) Derive some analogy G: P_A → P.

(3) Construct G^{-1}(S') = S'_A.

(4) Execute S'_A to get S_A, the solution to P_A.

If P was solved by executing a plan, then S' would be
available and step (1) could be omitted. Although nobody has
explicated this idea in publications, from various
conversations with workers in the field, I believe that the
preceding description is close to the paradigm that many
would have pursued. As such, it constitutes the (late-60's)
conventional wisdom of artificial intelligence. Certainly
this (planning) paradigm is attractively elegant! However, in
1969, when this research was begun, it was an inappropriate
approach for two reasons:

(1) There are no planning-oriented problem solvers that are
    fully implemented and operate in a domain with
    interesting nontrivial analogies.* This state of affairs
    probably will change in the next few years, but it now
    renders difficult any research that depends on the
    existence of such a system.

* PLANNER at MIT and QA4 at SRI are two current
  planning-oriented problem solvers that are under
  development. The first is partially implemented and the
  second exists only on paper. It is not yet clear what
  problem-solving power PLANNER will have, and how effective
  it will be in domains with interesting analogies.

(2) Given the plans generated by such a system, it is hard to
    know a priori at what level of generality the derived
    analogy will map into an executable analogous plan.* If
    S'_A fails, is G too strong, or wrong? Should G be
    modified and a variant computed, or should the system
    keep G and just back up its planner and generate an
    alternative subplan using its own planning logic? At best
    this is a rather complex research issue which would
    involve a good planning-oriented problem solver as an
    easily accessible research tool. At worst, the preceding
    paradigm may be too simple and the development of a
    suitable G may be interactive with how much successful
    problem solving has proceeded so far. (A complete G
    should not be attempted before some problem solving
    begins; rather, G is extended as needed in the course of
    solving P_A.)

* See Ref. 4 for a discussion of this issue.

Happily, there is an alternative approach that circumvents
the preceding difficulties. Consider a system that has solved
some problem P and is posed with a new (analogous) P_A to
solve. Clearly, it must operate on some large data base D
sufficient to solve both P and P_A. (See Figure 1.) In
addition to the subbases needed for solving P and P_A, there
are likely to be even more theorems in the remainder of D.

FIGURE 1  VENN DIAGRAM OF THEOREMS IN DATA BASE

Now, given P it is impossible to infer a minimal D_1. In
practice, a user may select some D_2 s.t. D_1 ⊆ D_2 ⊆ D which
the problem solver will access to solve P. If one studies the
searches that problem solvers generate when they work with
nonoptimal data bases, it is obvious that many of the
irrelevant inferences that are generated are derived from the
data-base assertions (theorems, axioms, facts) in D - D_1 (or
D_2 - D_1). In fact, as the number of theorems irrelevant to
the solution of P becomes large, the number of irrelevant
inferences derived from this set begins to dominate the
number of irrelevant inferences generated within D_1 and its
descendants alone.* In fact, while a problem solver might
solve P given an adequate and small D_1, it may be swamped
and run out of space† before a solution given a D that is
much larger than needed. Clearly, one effective use of
analogical information would be to select a decent subset D_2
of D such that size[D_1] ≤ size[D_2] << size[D]. For example,
a typical theorem in algebra provable by QA3(5), a resolution
logic theorem prover, may require only 10 axioms (D_1) while
the full algebraic data base has 250 axioms. If a system
could select a D_2 such that size[D_2] = 15 axioms, a massive
saving in search could be had. In fact, the theorem that
would be unprovable on a D with size[D] = 250 would now be
provable.

* Even given an optimal data base, a problem solver will
  generate some irrelevant inferences.

† In general, automatic problem solvers and theorem provers
  run out of space rather than time when they fail to solve a
  problem. Ernst(2) emphasizes this point with regard to GPS,
  and I have had similar experiences with QA3(5), a
  resolution logic theorem prover.

A second kind of information that would be useful to help
solve P_A would be a set of lemmas (or subgoals) L_1, ... L_j
whose analogs G(L_1), ... G(L_j) could be solved by the
system before attempting P_A. At this point I will not
discuss how to recognize a lemma‡ and generate its analog;§
instead, I merely want to note that lemmas may be effectively
used without using a planning language that forces backup in
case of failure. Suppose we somehow get G(L_1), ... G(L_j). A
typical planner would order the G(L_i), e.g., G(L_1), G(L_2)
... etc., attempt to solve them in sequence, and stop if any
lemma fails to be solved. In contrast, we merely need to
attempt each G(L_i). If we get a solution, add G(L_i) to the
data base (like a theorem)* and continue with the next lemma.
If we fail, continue anyway. At worst, we wasted some
computation time. Each useful G(L_i) decreases the number of
steps in the solution of P_A and may decrease the depth of
the solution tree. Thus, lemmas are helpful in getting a
faster solution. Note, however, that a successful G(L_i) need
not be used in the solution of P_A. It is merely available.
Thus, we are not bound by the fail-backup orientation of
sequential planning logics.

‡ Recognizing lemmas depends upon the problem-solving system.
  For example, in resolution logic, some good criteria for
  lemmahood are:

  (1) A ground unit used more than twice (or k times) in a
      proof.

  (2) A unit that is a merge.

  (3) A clause that is the "least descendant" of more than 2
      (or k) units.

§ Generating a lemma depends upon the system's ability to
  associate variables with variables, and that may be tricky
  when skolem functions are introduced.
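The "attempt every lemma, continue on failure" policy described above can be sketched as follows. This is only an illustration, not the LISP implementation: the function name `attempt_lemmas`, the string representation of clauses, and the `prove` callback (standing in for a call to a theorem prover such as QA3) are all assumptions introduced here.

```python
from typing import Callable, Iterable, List, Set

Clause = str  # placeholder; a real prover would use structured clauses


def attempt_lemmas(
    lemma_analogs: Iterable[Clause],
    data_base: Set[Clause],
    prove: Callable[[Clause, Set[Clause]], bool],
) -> List[Clause]:
    """Try each analog lemma once; keep those proved, skip the rest."""
    proved: List[Clause] = []
    for lemma in lemma_analogs:
        if prove(lemma, data_base):
            # A proved lemma becomes available like any other theorem.
            data_base.add(lemma)
            proved.append(lemma)
        # On failure we simply go on: no backup and no reordering.
    return proved


# Toy usage with a hypothetical prover that fails on one lemma.
toy_prover = lambda goal, db: goal != "G(L2)"
db = {"axiom-1"}
proved = attempt_lemmas(["G(L1)", "G(L2)", "G(L3)"], db, toy_prover)
```

The only cost of a failed lemma is the computation spent on it; the main problem is still attempted afterward with whatever lemmas did succeed.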

In summary, if we use analogical information to modify the
environment† in which a problem solver operates, we can
effectively abbreviate the work a problem solver must
perform. Of course, a well-chosen environment will always
lead to a more efficient search. Usually, we have no idea how
to tailor a subenvironment automatically to a particular
problem. Here we do it by exploiting its analogy with a known
solved problem. Now, the representations used, the
analogy-generating programs, and the types of additional
information output will depend upon the problem-solving
system (and even the domain of application). Any further
discussion needs to specify these two items.

* In fact, under some conditions, the axioms used to solve
  G(L_i) may be deleted from D_2 so that size[D_2] is
  decreased and G(L_i) is not attempted again inadvertently
  during the solution of P_A.

† Here environment is synonymous with data base. But it can
  also include permissible function orderings (in predicate
  calculus) and other kinds of restrictive information. Each
  rule restricting the "environment" could be translated into
  an equivalent new decision rule restricting the application
  of the inference procedures of the problem solver. However,
  I find it easier to think of ZORBA in terms of modified
  environments rather than (the equivalent) modified decision
  rules.

APPLICATIONS TO RESOLUTION LOGIC

The preceding discussion referred to any problem solver and
is just a proposal. Computer programs have been implemented
to apply this paradigm to a resolution logic theorem prover,
QA3.(5) For the class of analogies these programs handle,
this is an accomplishment. When we begin to focus upon a
particular paradigm, two issues are more easily resolved:

(1) What kinds of information are most useful to provide to
    the problem solver?

(2) Which representations shall we use to describe the
    analogies and handle the necessary data?

Resolution logic is an inference rule whose statements are
called clauses.*(1),(5) Thus, a resolution-oriented analogizer
will deal with clauses and their descriptions. In contrast,
GPS uses sets of objects to describe its states, and we would
expect that an analogy system devoted to GPS would deal with
(complex) objects and their attributes. Table 1 contrasts the
kinds of information helpful to QA3 and GPS. An analogy
facility developed for GPS would be oriented to its peculiar
information structures instead of clauses and axioms
indigenous to resolution.

Table 1

KINDS OF INFORMATION HELPFUL TO QA3 AND GPS

QA3 (Resolution)               GPS
----------------               ---
Relevant axioms                Relevant operators
Expected predicates            Abbreviated difference table
Lemmas                         Subgoals
Admissible function nestings   Restrictions on operator
                               applications

* A clause is an element in the conjunctive normal form of a
  skolemized wff in the predicate calculus. For example:
  ¬person[x] ∨ father[g(x); x] is the clause associated with:
  ∀x person[x] → ∃y father[y; x] (every person has a father).

I want to digress briefly and describe the kinds of theorems
that the implemented system, ZORBA-I, tackles. Briefly, they
are theorem pairs in domains that can be axiomatized without
constants (e.g., mathematics) and that have one-one maps
between their predicates. The theorems are fairly hard for
QA3 to solve. For example, ZORBA-I will be given a proof of
the theorem

T1. The intersection of two abelian groups is an abelian
    group

and is asked to generate an analogy with

T2. The intersection of two commutative rings is a
    commutative ring.

Given

T3. A factor group G/H is simple iff H is a maximal normal
    subgroup of G,

generate an adequate analogy with

T4. A quotient ring A/C is simple iff C is a maximal ideal
    in A.

None of these theorems are trivial for contemporary theorem
provers. (See Table 2, in a later section, for a listing of
additional theorem pairs.) T1 has a 35-step proof and T3 has
a 50-step proof in a decent axiomatization. A good theorem
prover (QA3) generates about 200 inferences in searching for
either proof when its data base is minimized to the 13 axioms
required for the proof of T1 or to the 12 axioms required for
the proof of T3. If the data base is increased to 20-30
reasonable axioms, the theorem prover may generate 600
clauses and run out of space before a proof is found. Note
also that the predicates in the problem statement of these
theorems contain only a few of the predicates used in any
proof. Thus, T1 can be stated using only {INTERSECTION;
ABELIAN}, but a proof requires {GROUP; IN; TIMES; SUBSET;
SUBGROUP; COMMUTATIVE} in addition. Thus, while the first set
is known to map into {INTERSECTION; COMMUTATIVERING}, the
second set can map into anything.

Figure 2 shows a set P including all the predicates in the
data base.

FIGURE 2  VENN DIAGRAMS OF RELATIONS IN STATEMENTS T, T_A,
          AND D

We know the sets of predicates in the statements of the new
and old theorems, T_A and T. In addition, we know the
predicates in some proof of T (since we have a proof at
hand). We need to find the set P_2 that contains the
relations we expect in some proof of T_A, and we want a map G
that carries the predicates of the old proof onto P_2.

Clearly, a wise method would be to find some G', a
restriction of G to the predicates in the statement of T,
that maps them onto the predicates in the statement of T_A.
Then incrementally extend G' to G_1, G_2, ... each on larger
domains until some G_n covers every predicate in the proof of
T and its image is P_2. ZORBA-I does this in such a way that
each incremental extension picks up new clauses that could be
used in a proof of T_A. In fact, if we get no new clauses
from an extended G_j, that may be reason to believe that G_j
is faulty. The next sections will describe the generation
algorithm in a little more detail.

ZORBA'S  REPRESENTATION  OF  AN  ANALOGY 

In the preceding sections I have implied that an analogy is
some kind of mapping. The ZORBA paradigm, i.e., using an
analogy to restrict the environment in which a theorem prover
works, does not restrict this mapping very much. For
different intuitively analogous theorem pairs, this mapping
would need to be able to associate predicates (and axioms) in
a one-one, one-many, or many-many fashion, possibly dependent
upon context. For other theorem pairs, one-one mappings and
context-free mappings are adequate. ZORBA-I is a particular
set of algorithms that restricts its acceptable analogies to
those which map predicates one-one with no context
dependence. It allows one-many associations between axioms;
i.e., one axiom of the proved theorem is associated with one
or more axioms that will be used to prove the new, analogous
theorem. More explicitly, a ZORBA-I analogy G is a relation
G^p × G^c × G^v, where:

(1) G^p is a one-one map between the predicates used in the
    proof of the proved theorem T and the predicates used in
    the proof of the unproved theorem T_A.

(2) G^c is a one-many mapping between clauses. Each clause
    used in the proof of T is associated with one or more
    clauses from the data base D that ZORBA-I expects to use
    in proving T_A.

(3) G^v is a many-many mapping between the variables that
    appear in the statement of T and those that appear in the
    statement of T_A.

Different sections of ZORBA-I use these various maps, e.g.,
G^v and/or G^p and/or G^c. Usually I will drop the
superscript and simply refer to "the analogy G." Thus "the
analog of an axiom ax_k under analogy G" should be understood
to mean G^c[ax_k], and will often be mentioned simply as "the
analog of ax_k."

In the previous section I refer to a sequence of analogies
G_1 ... G_n. ZORBA-I usually does not develop G^c in one
step. Rather, it incrementally extends some limited analogy
into one that maps a few more variables, predicates, or
clauses. This process is described in full detail in the next
few sections. Here, I just want to define several terms that
refer to this process. When I refer to "the analogy between T
and T_A" I refer to a mapping that includes every variable in
the statement of T, and every predicate and clause used in
the proof of T. This "complete" mapping is obtained as the
final step of a sequence of mappings that contain the
associations of some predicates and some clauses. I refer to
these incomplete mappings as "partial analogies." In
addition, we are concerned with an important relationship
between two (partial) analogies. A (partial or complete)
analogy G_k is an extension of a partial analogy G_j if some
of G_j, e.g., G_j^p, G_j^c, G_j^v, is a restriction of the
corresponding submap of G_k to a smaller domain. Intuitively,
when we add a new predicate or clause association to G_j so
as to create G_k, we say that G_j has been extended to G_k.
We are now ready to survey ZORBA-I.

AN OVERVIEW OF THE ANALOGY-GENERATING ALGORITHM

I want to describe the ZORBA-I algorithm in two stages, first
briefly in this section and then in greater detail in the
following two sections. I will precede these descriptions by
some background on the representations and information
available to the system.

ZORBA-I is presented with the following:

(1) A new theorem to prove, T_A.

(2) An analogous theorem T (chosen by the user) that has
    already been proved.

(3) Proof[T], an ordered set of clauses c_k s.t. ∀k, c_k is
    either

    (a) A clause in ¬T

    (b) An axiom

    (c) Derived by resolution from two clauses c_i and c_j,
        j < k and i < k.
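The ordering condition on Proof[T] in item (3) can be illustrated with a small sketch. The `Step` record, the source labels, and the function names below are hypothetical conveniences, not part of ZORBA-I; clauses are plain strings and indices are 0-based.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Step:
    clause: str
    source: str  # "negated-theorem", "axiom", or "resolvent"
    parents: Optional[Tuple[int, int]] = None  # indices (i, j) for resolvents


def well_ordered(proof: List[Step]) -> bool:
    """Check that each resolvent cites two strictly earlier clauses."""
    for k, step in enumerate(proof):
        if step.source == "resolvent":
            if step.parents is None:
                return False
            i, j = step.parents
            if not (0 <= i < k and 0 <= j < k):
                return False
        elif step.source not in ("negated-theorem", "axiom"):
            return False
    return True


def axioms_used(proof: List[Step]) -> List[str]:
    """Collect the axioms of a proof (what the text later calls AXSET)."""
    return [s.clause for s in proof if s.source == "axiom"]


toy_proof = [
    Step("~group[a;*]", "negated-theorem"),
    Step("abelian[a;*]", "negated-theorem"),
    Step("~abelian[x;*] v group[x;*]", "axiom"),
    Step("group[a;*]", "resolvent", (1, 2)),
    Step("[]", "resolvent", (0, 3)),
]
```

The sketch only checks bookkeeping; it does not verify that each resolvent actually follows from its parents.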

These three items of information are problem-dependent. In
addition, the user specifies a "semantic template" for each
predicate in his language. This template associates a
semantic category with each predicate and predicate-place and
is used to help constrain the predicate mappings to be
meaningful. For example, STRUCTURE[SET; OPERATOR] is
associated with the predicate "group." Thus, ZORBA-I knows
that "A" is a set and "*" is an operator when it sees
group[A;*]. Currently, the predicate types (for algebra) are
STRUCTURE, RELATION, MAP, and RELSTRUCTURE; the variable
types are SET, OPERATOR, FUNCTION, and OBJECT.
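A semantic template can be pictured as a small table keyed by predicate name. In the sketch below only the entry for "group" comes from the text; the entry for "subset", the function names, and the rule that two predicates may be associated only when their predicate types agree are assumptions made for illustration, since the text does not spell out how the constraint is applied.

```python
from typing import Dict, List, Tuple

Template = Tuple[str, Tuple[str, ...]]  # (predicate type, type of each place)

TEMPLATES: Dict[str, Template] = {
    "group": ("STRUCTURE", ("SET", "OPERATOR")),  # given in the text
    "subset": ("RELATION", ("SET", "SET")),       # hypothetical entry
}


def variable_types(predicate: str, args: List[str]) -> Dict[str, str]:
    """Read off the type of each argument from the predicate's template."""
    _, places = TEMPLATES[predicate]
    if len(args) != len(places):
        raise ValueError(f"{predicate} expects {len(places)} arguments")
    return dict(zip(args, places))


def may_associate(p: str, q: str) -> bool:
    """Assumed constraint: only predicates of the same type may be mapped."""
    return TEMPLATES[p][0] == TEMPLATES[q][0]


types_seen = variable_types("group", ["A", "*"])
```

Under this reading, seeing group[A;*] is enough to type "A" as a SET and "*" as an OPERATOR, and a STRUCTURE predicate would never be paired with a RELATION predicate.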


In addition, ZORBA-I can make up a description descr[c] of
any clause c according to the following rules regarding the
predicates of c.

(1) ∀p s.t. p and ¬p appear in c, impcond[p] ∈ descr[c].

(2) ∀p s.t. p appears in c, pos[p] ∈ descr[c].

(3) ∀p s.t. ¬p appears in c, neg[p] ∈ descr[c].

Thus, the axiom, every abelian group is a group, i.e.,

    ∀(x *) abelian[x; *] ⇒ group[x; *],

is expressed by the clause

    c_1: ¬abelian[x; *] ∨ group[x; *],

which is described by

    neg[abelian], pos[group]

Each element of a description, e.g., pos[group], is a
"feature" of the description. Each feature corresponds to one
predicate, so the number of features in a clause equals the
number of predicates in the clause. The theorem, the
homomorphic image of a group is a group, i.e.,

    ∀(x y *_1 *_2 φ)
        hom[φ; x; y] ∧ group[x; *_1] ⇒ group[y; *_2]

is expressed by the clause

    c_2: ¬hom[φ; x; y] ∨ ¬group[x; *_1] ∨ group[y; *_2]

and is described by

    neg[hom], impcond[group]

Two different clauses may have the same description. Let:

    c_3: ¬intersection[x; y; z] ∨ subset[x; y]

    c_4: ¬intersection[x; y; z] ∨ subset[x; z]

Then:

    descr[c_3] = descr[c_4] = neg[intersection], pos[subset]

Clause descriptions are used to characterize the axioms whose
analogs we seek. ZORBA-I selects as analogs clauses that have
descriptions that are close to the analogs of the
descriptions* of axioms in the known axiom set. Although in a
special context ZORBA-I actually uses an ordering relation on
a set of descriptions to find a "best clause," it usually
exploits a simpler approach. We will say that a clause c
satisfies a description d iff d ⊆ descr[c]. Thus, several
clauses may satisfy the same description.

*The "analog of a description" is defined later.

Let:

    c_5: ¬intersection[x; y; z] ∨ ¬group[y; *]
         ∨ ¬group[z; *] ∨ group[x; *]

    c_6: ¬subgroup[x; y; *] ∨ subset[x; y]

Then, the following statements are true:

(1) {c_2, c_5} satisfy impcond[group]

(2) {c_1, c_2, c_5} satisfy pos[group]

(3) c_1 satisfies neg[abelian], pos[group]

(4) {c_3, c_4, c_6} satisfy pos[subset]

(5) c_6 satisfies neg[subgroup], pos[subset]

(6) No clause of these six satisfies pos[intersection]

Clearly,  if  a  description  contains  only  a  few 
features,  then  several  clauses  may  satisfy  it. 
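The description rules and the "satisfies" relation can be checked mechanically. The sketch below is an illustration under stated assumptions: clauses are reduced to signed predicate names (arguments dropped), the three rules are applied literally so that a predicate occurring with both signs carries pos, neg, and impcond features, and c_6 is read as ¬subgroup[x;y;*] ∨ subset[x;y], which is what statements (4) and (5) require. The names `descr`, `satisfies`, and `satisfying` are chosen here for convenience.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Literal = Tuple[bool, str]  # (is_positive, predicate name); arguments ignored
Feature = Tuple[str, str]   # e.g. ("pos", "group")


def descr(clause: Iterable[Literal]) -> FrozenSet[Feature]:
    """Build a clause description from rules (1)-(3), applied literally."""
    literals = list(clause)
    positive = {pred for sign, pred in literals if sign}
    negative = {pred for sign, pred in literals if not sign}
    features = {("pos", p) for p in positive}
    features |= {("neg", p) for p in negative}
    features |= {("impcond", p) for p in positive & negative}
    return frozenset(features)


def satisfies(clause: Iterable[Literal], d: Iterable[Feature]) -> bool:
    """c satisfies d iff d is a subset of descr[c]."""
    return set(d) <= descr(clause)


CLAUSES: Dict[str, List[Literal]] = {
    "c1": [(False, "abelian"), (True, "group")],
    "c2": [(False, "hom"), (False, "group"), (True, "group")],
    "c3": [(False, "intersection"), (True, "subset")],
    "c4": [(False, "intersection"), (True, "subset")],
    "c5": [(False, "intersection"), (False, "group"),
           (False, "group"), (True, "group")],
    "c6": [(False, "subgroup"), (True, "subset")],
}


def satisfying(d: Iterable[Feature]) -> List[str]:
    """Names of the example clauses that satisfy description d."""
    d = list(d)
    return sorted(name for name, c in CLAUSES.items() if satisfies(c, d))
```

With this literal reading, `satisfying([("pos", "group")])` picks out c1, c2, and c5, matching statement (2); a stricter reading in which impcond replaces pos and neg would not.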

The semantic templates are used during both the INITIAL-MAP
(when the predicates and variables in the theorem statements
are mapped) as well as in the EXTENDER, which adds additional
predicates needed for the proof of T_A and finds a set of
axioms to use in proving T_A. The clause descriptions are
used only by EXTENDER.

I intend the brief description that follows to provide an
overview of ZORBA-I in preview to the next two sections of
text, which describe it in considerable detail. In addition,
this preview section may be a helpful "roadmap" for reference
when the reader immerses himself in the details that follow
later on.

ZORBA-I operates in two stages. INITIAL-MAP is applied to the
statements of T and T_A to create a G_1, which is used by
EXTENDER to start its sequence of G_j^p and G_j^c, which
terminate in a complete G. INITIAL-MAP starts without a
priori information about the analogy it is asked to help
create. Both G^p and G^v are empty when it begins. It uses
the syntax of the wffs that express T and T_A as well as the
restrictions imposed by the semantic categories to generate
G^p and G^v that include all the predicates and variables
that appear in the two wffs. For example, the statements of
T1 - T2 can contain three of the nine predicates used in
proof[T1] and the statements of T3 - T4 can contain five of
the 12 predicates used in proof[T3]. In brief, it provides a
starting point from which EXTENDER can develop a complete G.

The INITIAL-MAP uses a rule of inference called
ATOMMATCH[atom_1; atom_2; G], which extends analogy G by
adding the predicates and mapped variables of atom_1 and
atom_2 to analogy G. Thus, ATOMMATCH now limits ZORBA-I to
analogies where atoms in the statements of T and T_A map
one-one. INITIAL-MAP is a sophisticated search program that
sweeps ATOMMATCH over likely pairs of atoms, one of which is
from the statement of T, the other from the statement of T_A.
Alternative analogies are kept in parallel (no backup), and
INITIAL-MAP terminates when it has found some analogy that
includes all the predicates in the theorem statements. This
one is output as G^p.
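A single ATOMMATCH-like step can be sketched as below. This is a simplified illustration, not the actual rule: it assumes that the two atoms have the same number of places and that variables are paired positionally, and the `Analogy` record and lowercase function name `atommatch` are introduced here. The step returns a new analogy rather than modifying the old one, so that alternatives can be kept side by side.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (predicate, variables)


@dataclass
class Analogy:
    pred_map: Dict[str, str] = field(default_factory=dict)          # one-one
    var_pairs: Set[Tuple[str, str]] = field(default_factory=set)    # many-many


def atommatch(atom1: Atom, atom2: Atom, analogy: Analogy) -> Optional[Analogy]:
    """Extend the analogy with atom1 -> atom2, or return None on conflict."""
    (p1, args1), (p2, args2) = atom1, atom2
    if len(args1) != len(args2):
        return None  # simplifying assumption: equal arity only
    if analogy.pred_map.get(p1, p2) != p2:
        return None  # p1 already has a different analog
    if p1 not in analogy.pred_map and p2 in analogy.pred_map.values():
        return None  # p2 is already the analog of another predicate
    return Analogy(
        {**analogy.pred_map, p1: p2},
        analogy.var_pairs | set(zip(args1, args2)),
    )


g0 = Analogy()
g1 = atommatch(("intersection", ("x", "y", "z")),
               ("intersection", ("a", "b", "c")), g0)
g2 = atommatch(("abelian", ("x", "*")), ("commutativering", ("a", "+")), g1)
g3 = atommatch(("group", ("x", "*")), ("commutativering", ("a", "+")), g2)
```

Here g3 is None because "commutativering" is already the analog of "abelian", which is how the one-one restriction on predicates shows up in the sketch.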

EXTENDER accepts a partial analogy generated by INITIAL-MAP
and uses it as the first term in a sequence of successive
analogies G_j. The axioms used in proof[T] are few in
comparison to the size of the large data base and comprise
the "domain" for a complete G^c. For each axiom used in
proof[T], we want to find a clause from the data base that is
analogous to it. The axioms used in proof[T] are called AXSET
and are used by EXTENDER in a special way. Each partial
analogy G_j is used to partition AXSET into three disjoint
subsets called ALL[G_j], SOME[G_j], and NONE[G_j].

If all the predicates in an axiom ax_k ∈ AXSET are in G_j^p,
then ax_k is in ALL[G_j]; if some (but not all) of its
predicates are in G_j^p, then ax_k is in SOME[G_j]; and if
none of its predicates are in G_j^p, then ax_k is in
NONE[G_j]. For brevity, these sets will be called ALL, SOME,
and NONE, and their dependence on G_j will be implicit. This
partition is trivial to compute, and initially, none or a few
ax_k are in ALL, and most ax_k belong to SOME and NONE. We
want to develop a sequence of analogies G_j, j = 1, ... n,
that contain an increasingly larger set of predicates and
their analogs. If an axiom is contained in ALL, then by
definition we know the analogs of each of its predicates. It
cannot assist us in learning about new predicate
associations. In contrast, we know nothing about the analogs
of any of the predicates used in axioms contained in NONE.
Analog clauses for these axioms are hard to deduce since we
have no relevant information to start a search. Unlike these
two extreme cases, the axioms in SOME are especially helpful
and will become the focus of our attention. For each such
axiom we know the analogs of some of its predicates from G_j.
These provide sufficient information to begin a search for
the clauses that are analogous to them. When we finally
associate an axiom with its analog, we can match their
respective descriptions and associate the predicates of each
that do not appear in G_j^p. We can extend G_j to G_{j+1},
and thus the analogs of axioms in SOME provide a bridge
between the known and the unknown, between the current G_j
and a descendant G_{j+1}. When EXTENDER has satisfactorily
terminated, ALL = AXSET, SOME = NONE = ∅. So the game becomes
finding some way to systematically move axioms from NONE to
SOME to ALL in such a way that for each ax_k moved, some
analog G_j[ax_k] is found that can be used in the proof of
T_A. Moreover, each new association of clauses should help us
extend G_j → G_{j+1} by providing information about
predicates not contained in G_j.
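The ALL / SOME / NONE partition is simple enough to sketch directly. In the illustration below each axiom is represented only by the set of predicate names it contains, and the partial analogy only by the set of predicates it already maps; the function name and the toy axiom names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def partition_axset(
    axset: Dict[str, Set[str]],
    mapped_predicates: Set[str],
) -> Tuple[List[str], List[str], List[str]]:
    """Split axioms by how many of their predicates the analogy covers."""
    all_, some, none = [], [], []
    for name, predicates in axset.items():
        known = predicates & mapped_predicates
        if known == predicates:
            all_.append(name)   # every predicate already has an analog
        elif known:
            some.append(name)   # a bridge: partly known, partly unknown
        else:
            none.append(name)   # nothing to start a search from
    return all_, some, none


toy_axset = {
    "ax1": {"abelian", "group"},
    "ax2": {"intersection", "subset"},
    "ax3": {"subgroup", "subset"},
}
# Suppose the initial analogy maps only the predicates in the theorem statement.
ALL, SOME, NONE = partition_axset(toy_axset, {"intersection", "abelian"})
```

In this toy case ax1 and ax2 land in SOME and ax3 in NONE; once an analog for ax2 is found and "subset" is mapped, ax3 moves from NONE to SOME, which is the movement the text describes.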

A  DETAILED  DESCRIPTION  OF  INITIAL-MAP 

At heart, ZORBA-I is a heuristic program designed to generate
analogies between theorem pairs stated in a subset of
predicate calculus. It has been designed and implemented in a
fairly modular manner to facilitate understanding and ease of
generalization. Thus, much of the system can be described in
algorithmic terms. In this section I hope to blend some
appreciation of the heuristic foundations of the program
while describing its operation with algorithmic clarity.
ZORBA-I uses an interesting set of searching and matching
routines, which have been empirically designed, generalized,
and tested on a set of problem pairs (T1 - T2 and T3 - T4 are
fair representatives of this set). The control structures of
INITIAL-MAP and EXTENDER have been designed to pass fairly
similar structures to the various match routines (described
below). Thus, the following descriptions will cover cases
where the structures to be mapped are fairly similar. For
example, most of the routines that match sets of items assume
that the sets are of equal cardinality and that they will map
one-one. Such assumptions are valid for a large class of
interesting analogies (such as the group-ring analogy in
abstract algebra) and simplify the description of the various
procedures. Analogies that require weaker assumptions and
more complex procedures are described elsewhere.(6)

In the previous section I motivated the design
of INITIAL-MAP and EXTENDER, which generate a
restricted analogy and expand it to cover all the
relations and axioms necessary for the new proof.
ZORBA-I can be easily expressed in terms of these
two functions as follows:

zorba1[newwff; oldwff; AXSET*] :=

(1) Set analogies to the list of analogies
generated by initial-map[newwff; oldwff].

(2) Apply extender[analogy; AXSET] to each
analogy on analogies.

(3) Return the resultant set of analogies.

The  preceding  description  allows  that  there  may 
be  more  than  one  analogy  generated  by  either 
INITIAL-MAP  or  EXTENDER.  In  practice,  however, 
each  tends  to  generate  but  one  (good)  analogy. 
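
The three steps of zorba1 amount to composing the two phases. The following is only a sketch in Python (the implemented system was written in LISP). The two phases are passed in as hypothetical callables, and the sketch assumes that each of them returns a list of analogies; the stub phases exist purely to make the example runnable.

```python
def zorba1(newwff, oldwff, axset, initial_map, extender):
    """Run INITIAL-MAP, then EXTENDER on every analogy it produced."""
    analogies = initial_map(newwff, oldwff)        # step (1)
    extended = []
    for analogy in analogies:                      # step (2)
        extended.extend(extender(analogy, axset))
    return extended                                # step (3)


# Stub phases, for illustration only (not the real routines).
def stub_initial_map(newwff, oldwff):
    return [{"abelian": "commutativering"}]


def stub_extender(analogy, axset):
    return [dict(analogy, group="ring")]


result = zorba1("T_A", "T", [], stub_initial_map, stub_extender)
```

Collecting the results in one flat list reflects the remark that either phase may, in principle, yield more than one analogy.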

In the following paragraphs I will describe
INITIAL-MAP in some detail. EXTENDER will be
discussed in the next section.

* AXSET is the set of axioms that appears in
proof[T].

INITIAL-MAP is designed to take two first-order
predicate calculus wffs and attempt to generate
a mapping between the predicates and variables
that appear in them. The variable mapping
information is used to assist INITIAL-MAP in mapping
predicates in cases of seeming ambiguity;
INITIAL-MAP outputs a set of associated predicates that
appear in the statements of T_A and T. This
restricted mapping is used as a starting analogy by
EXTENDER, which finds a complete mapping for all
the predicates used in proof[T]. As a byproduct
EXTENDER finds analogs for each of the axioms on
AXSET. INITIAL-MAP (unlike EXTENDER) does not
reference AXSET, the set of axioms used to prove
T, and is symmetric with respect to which
wff represents the proved or unproved theorem.
INITIAL-MAP uses atommatch[atom_1; atom_2; G] as a
rule of inference to add the predicate/variable
information to analogy G. As its name hints,
ATOMMATCH matches the predicates and variables of
its atomic arguments and adds the resultant mapping
to the developing analogy (G).

FIGURE 3 HIERARCHY OF MATCHING ROUTINES
CALLED BY INITIAL-MAP

ATOMMATCH is used as an elementary operation
by every matching routine in the INITIAL-MAP
system (Figure 3). Thus, we will discuss it first
and then consider how INITIAL-MAP is organized to
apply it intelligently. Consider how we might
write an ATOMMATCH. Suppose atom_1 and atom_2 are
of the same order (same number of variables) and
each variable place in each atom has the same
semantic type. For example, let

atom_1 = intersection[x_1; x_2; x_3]

atom_2 = intersection[y_1; y_2; y_3]

Clearly, we want

intersection ↔ intersection*

and x_i ↔ y_i, i = 1, 2, 3.

So, if atom_1 = p[x_1; ... ; x_n]
and atom_2 = q[y_1; ... ; y_m]
and p = q (thus, n = m)
we will set p ↔ q

and x_i ↔ y_i, i = 1, 2, ..., n.

So far ATOMMATCH is quite trivial. Suppose,
however, p ≠ q or n ≠ m.

For example, let atom_1 = group[x; *_1]

and atom_2 = ring[y; *_2; +_2].

Clearly we want to associate the set x with the
set y and the operator *_1 with either or both of
*_2 and +_2. ATOMMATCH can know which variables
represent sets, etc., by checking the semantic
templates associated with group and ring. Now,
the template associated with group is
structure[set; operator] while that associated with ring is
structure[set; operator; operator]. We will map
variables with each other so as to preserve
predicate place ordering and semantic type. To
handle the unequal number of variables, we will
temporarily expand the atom group[x; *_1] to include
a dummy variable of type operator,
"dummyop," and will rewrite it as group[x; *_1;
dummyop]. The symbol "dummyop" is used to expand
either (or both) atoms to be of the same order,
with a variable (possibly dummy) of the same
semantic type in corresponding places in each atom.
Then we can map the variables one-one in order of
appearance. For example we can associate

x ↔ y

and

(*_1, dummyop) ↔ (*_2, +_2).

Then we can remove dummyop and rewrite

*_1 ↔ (*_2, +_2).


* I will use a double-headed arrow "↔", as in
"x ↔ y", to mean "x is associated with (analogous
to) y."

We can describe this process formally in two
stages.

(1) Make the two atoms type-compatible and
of the same order by adding dummy
variables whenever necessary.

Let atom_1 = p[x_1; ... ; x_n]

atom_2 = q[y_1; ... ; y_m]

template[atom_1] = type[p][type[x_1]; ... ; type[x_n]]

template[atom_2] = type[q][type[y_1]; ... ; type[y_m]]

Furthermore, suppose that the ordering of the types
is the same in each template, even though the
number of variables of each particular type need
not be identical for corresponding "type blocks."
Thus, in the preceding example, in both "group"
and "ring" the type set precedes the type operator.
Each template has one set variable, but a differing
number of operator variables. Thus, we could
partition the ordered set of variables in atom_1 and
atom_2 by letting some x_i and x_{i+1} belong to the
same partition if type[x_i] = type[x_{i+1}]. Now there
are an equal number of partitions in both atom_1
and atom_2. Returning to our example, we partition
group[x; *_1] into [[x], [*_1]] and ring[y; *_2; +_2]
into [[y], [*_2, +_2]]. (The brackets indicate that
the order of elements is preserved.)

(2) Map the partitioned subsets into each
other, preserving their order within the
partitions, and map elements into elements
if the two subsets have an equal number
of elements.

This completes our brief description of
ATOMMATCH. From now on, we will consider ATOMMATCH
as an elementary operation that will expand the
developing analogy to include a (possibly) new
predicate pair and (possibly) new pairs of variable
associations. We need to know how to select pairs
of atoms from the statements of T and T_A to be
ATOMMATCHed.
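
The two stages of ATOMMATCH can be sketched as follows. This is an illustrative Python reading, not the original LISP routine: the `TEMPLATES` table, the tuple representation of atoms, and the `ana` dictionary with `preds` and `vars` entries are all assumptions made for the example. Type blocks of unequal size are recorded as a block-to-block association, which stands in for the dummy-variable device.

```python
from itertools import groupby

# Hypothetical semantic templates: predicate -> type of each variable place.
TEMPLATES = {
    "intersection": ("set", "set", "set"),
    "group": ("set", "operator"),
    "ring": ("set", "operator", "operator"),
}


def type_blocks(atom):
    """Split an atom's variables into ordered runs of equal semantic type."""
    pred, args = atom
    typed = zip(TEMPLATES[pred], args)
    return [(t, tuple(v for _, v in run))
            for t, run in groupby(typed, key=lambda tv: tv[0])]


def atommatch(atom1, atom2, ana):
    """Add the predicate and variable associations of atom1 and atom2 to ana."""
    blocks1, blocks2 = type_blocks(atom1), type_blocks(atom2)
    if [t for t, _ in blocks1] != [t for t, _ in blocks2]:
        raise ValueError("templates do not share the same type ordering")
    ana["preds"][atom1[0]] = atom2[0]
    for (_, vars1), (_, vars2) in zip(blocks1, blocks2):
        if len(vars1) == len(vars2):
            for x, y in zip(vars1, vars2):     # element-to-element mapping
                ana["vars"][x] = y
        else:
            ana["vars"][vars1] = vars2         # block-to-block mapping
    return ana


same = atommatch(("intersection", ("x1", "x2", "x3")),
                 ("intersection", ("y1", "y2", "y3")),
                 {"preds": {}, "vars": {}})
mixed = atommatch(("group", ("x", "*1")),
                  ("ring", ("y", "*2", "+2")),
                  {"preds": {}, "vars": {}})
```

In the group/ring case the operator block of the group atom is associated, as a whole, with the two-operator block of the ring atom.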

We have two wffs representing T and T_A as
arguments of INITIAL-MAP, and we want to find some
way to slide ATOMMATCH over pairs of atoms
selected from the wffs. First, note that the syntax
of the wffs may be a helpful guide in selecting
potential matches.

Suppose T: A ⊃ p(x)

T_A: B ⊃ q(y),

where A and B are any wffs.

We would presume that p ↔ q (predicates)

x ↔ y (variables)

and A ↔ B (sub-wffs),

where we expect that wffs A and B would be
decomposed down to atoms for ATOMMATCH. If A and B
had implication signs in them, we could decompose
them similarly. There are many possibilities for
the forms of T and T_A. We find that if T and T_A
are closely analogous, then their syntactic forms
are likely to be very similar. ZORBA-I considers
T and T_A to have the formats that can be
represented by the generative grammar below:

T → A ⊃ A

A → p[x_1; ... ; x_n] ∧ A | p[x_1; ... ; x_n]

INITIAL-MAP is designed to decompose the input
wffs T and T_A into associated syntactic
substructures until a subwff is either an atom
p[x_1; ... ; x_n] or a conjunction of atoms

∧_{i=1}^{k} p_i[x_1; ... ; x_n].

At this point it enters a hierarchy of selecting
and matching routines (Figure 3) to decide which
pairs of atoms shall be ATOMMATCHed. Naturally,
if the subwffs are just atoms it calls ATOMMATCH
directly. Otherwise, it enters a program
hierarchy headed by a routine named SETMATCH, which
selects appropriate atom pairs from the sets of
conjuncted atoms in the subwffs.

In the following discussion, the numbers of
atoms conjuncted in each set are assumed equal
(k = l). SETMATCH can be described in terms of
its subfunctions as follows:

setmatch[set_1; set_2; ana*] :=

(1) Partition the atoms in set_1 and set_2
into subsets that have identical semantic
templates (a "semantic partition").
Thus if set_1 is group[x; *] ∧ abelian[y; *]
∧ intersection[z; x; y], the semantic
partition will be
{{intersection[z; x; y]} {group[x; *],
abelian[y; *]}}, since group and abelian
are both of type struct[set; op].

(2) Select the partitions of set_1 and set_2
that have but one element and call these
sing_1 and sing_2, respectively.

(3) The remaining partitions have more than
one element; call them mult_1 and mult_2,
respectively.

(4) Match the atoms in sing_1 with those in
sing_2 by executing singlematch[sing_1;
sing_2; ana].

(5) Match the remaining atoms by executing
multimatch[mult_1; mult_2; ana].

SETMATCH, SINGLEMATCH, and MULTIMATCH are all
heuristically designed one-pass matching
strategies that make strong assumptions about the
nature of the theorem statements T and T_A for an
analogous theorem pair.
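
Steps (1) to (3) of SETMATCH, the semantic partition and its split into single-atom and multi-atom blocks, can be sketched as below. This is a minimal Python illustration under stated assumptions: the `TEMPLATES` table is hypothetical, and in particular the predicate type given to `intersection` ("relation") is invented for the example.

```python
from collections import defaultdict

# Hypothetical templates: predicate -> (predicate type, variable types).
TEMPLATES = {
    "group": ("structure", ("set", "operator")),
    "abelian": ("structure", ("set", "operator")),
    "intersection": ("relation", ("set", "set", "set")),
}


def semantic_partition(atoms):
    """Group atoms whose predicates have identical semantic templates."""
    blocks = defaultdict(list)
    for atom in atoms:
        blocks[TEMPLATES[atom[0]]].append(atom)
    return list(blocks.values())


def split_partition(atoms):
    """Return (sing, mult): the single-atom and the multi-atom blocks."""
    blocks = semantic_partition(atoms)
    sing = [block for block in blocks if len(block) == 1]
    mult = [block for block in blocks if len(block) > 1]
    return sing, mult


set1 = [("group", ("x", "*")),
        ("abelian", ("y", "*")),
        ("intersection", ("z", "x", "y"))]
sing1, mult1 = split_partition(set1)
```

Here `sing1` holds the intersection atom alone and `mult1` holds the block containing the group and abelian atoms, matching the partition given in step (1).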


* When an analogy G is referenced within the
description of an algorithm, it will be represented
as a variable ana wherever that is more convenient.


SETMATCH assumes that the atoms in set_1 and
set_2 will map one-one and that the semantic
partitions will map one-one. Suppose we have a
semantic partition thus:

partition_1 = {{atom_1 atom_2} {atom_3 atom_4} {atom_5}}

partition_2 = {{atom_6 atom_7} {atom_8 atom_9} {atom_10}}

SETMATCH assumes that {atom_5} and {atom_10} will
correspond, rather than {atom_5} and, say, {atom_6
atom_7}. It calls SINGLEMATCH to map the single-atom
partitions onto the single-atom partitions.
In addition, it calls MULTIMATCH to map, in
pairs, the partitions containing several atoms
each.


MULTIMATCH assumes that the analogy will
preserve semantic type sufficiently well so that
atoms within a particular partition will
correspond only to atoms in one other partition.

Thus, if {atom_1, atom_2} ↔ {atom_6, atom_7}

then atom_1 ↔ atom_6 or atom_7

atom_2 ↔ atom_6 or atom_7

It forbids matches across partitions, such as
pairing atom_1 or atom_2 with an atom outside
{atom_6, atom_7}.


SINGLEMATCH and MULTIMATCH also share a common
default condition. If all but one of the elements
of a set X are mapped with all but one of the
elements of a set Y, then the two remaining elements
are associated by default without any further
decision making. In SINGLEMATCH the sets X and Y are
sets of atoms or partitions of atoms.


SINGLEMATCH[set_1; set_2; ana] may be easily
described in terms of this default condition and a
function called tempsift[s_1; s_2; testfn; ana].
TEMPSIFT applies testfn[x; y] to the first element
x of s_1 and each successive element y of s_2 until
it finds a y' ∈ s_2 such that testfn[x; y'] = T.
It then executes

atommatch[x; y'; ana],

increments to the next element x' of s_1, and
seeks another y'' ∈ s_2 such that testfn[x'; y''] =
T, etc. Thus, for every x ∈ s_1, it finds the
first y ∈ s_2 such that testfn[x; y] = T and
executes atommatch[x; y; ana]. Typical testfns check
whether x and y have the same semantic template
or are analogs of each other according to the
developing analogy, ana.

singlematch[set_1; set_2; ana] :=

(1) If set_1 and set_2 have but one element
("terminal default condition"), go to 8.

(2) Execute tempsift[set_1; set_2; testfn_1; ana],
where testfn_1[x; y] is true iff x and y
have the same semantic template.

(3) If set_1 and set_2 are empty, go to 9.
If the terminal default condition is
true, go to 8.

(4) Execute tempsift[set_1; set_2; testfn_2; ana],
where testfn_2[x; y] is true iff the
predicate letter in atom y is the analog
of the predicate letter in
atom x according to analogy ana.

(5) If set_1 and set_2 are empty, go to 9.
If the terminal default condition holds,
go to 8.

(6) Execute tempsift[set_1; set_2; testfn_3; ana],
where testfn_3[x; y] is true iff the semantic
type of the predicate appearing in atom x is
the same as the semantic type of the
predicate appearing in atom y.

(7) If set_1 and set_2 are empty, go to 9.
If the terminal default condition holds,
go to 8. Otherwise print an error
message and halt.

(8) Apply ATOMMATCH to the remaining atoms
of set_1 and set_2.

(9) STOP.

To illustrate the preceding algorithm with a
simple example, let

set_1 = {intersection[x; y; z], abeliangroup[x; *]}

set_2 = {intersection[u; v; w],
commutativering[u; *; +]}

Step 2 associates

intersection[x; y; z] ↔ intersection[u; v; w]

and the terminal default condition associates

abeliangroup[x; *] ↔ commutativering[u; *; +]
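
TEMPSIFT and the SINGLEMATCH control loop can be sketched as follows, applied to the example just given. This is a simplified Python reading with several assumptions: the `TEMPLATES` table is hypothetical, `atommatch` is reduced to a stand-in that records only the predicate pair, and matched atoms are removed from both sets (the emptiness tests in the algorithm suggest this, but it is not stated explicitly).

```python
# Hypothetical templates: predicate -> (predicate type, variable types).
TEMPLATES = {
    "intersection": ("relation", ("set", "set", "set")),
    "abeliangroup": ("structure", ("set", "operator")),
    "commutativering": ("structure", ("set", "operator", "operator")),
}


def atommatch(x, y, ana):
    """Stand-in for ATOMMATCH: record only the predicate association."""
    ana[x[0]] = y[0]


def tempsift(s1, s2, testfn, ana):
    """Pair each x in s1 with the first y in s2 passing testfn; remove both."""
    for x in list(s1):
        for y in s2:
            if testfn(x, y, ana):
                atommatch(x, y, ana)
                s1.remove(x)
                s2.remove(y)
                break


def same_template(x, y, ana):          # testfn_1
    return TEMPLATES[x[0]] == TEMPLATES[y[0]]


def known_analogs(x, y, ana):          # testfn_2
    return ana.get(x[0]) == y[0]


def same_pred_type(x, y, ana):         # testfn_3
    return TEMPLATES[x[0]][0] == TEMPLATES[y[0]][0]


def singlematch(set1, set2, ana):
    s1, s2 = list(set1), list(set2)
    for testfn in (None, same_template, known_analogs, same_pred_type):
        if testfn is not None:
            tempsift(s1, s2, testfn, ana)
        if not s1 and not s2:
            return ana
        if len(s1) == 1 and len(s2) == 1:   # terminal default condition
            atommatch(s1[0], s2[0], ana)
            return ana
    raise ValueError("SINGLEMATCH could not pair the remaining atoms")


set1 = [("intersection", ("x", "y", "z")), ("abeliangroup", ("x", "*"))]
set2 = [("intersection", ("u", "v", "w")), ("commutativering", ("u", "*", "+"))]
ana = singlematch(set1, set2, {})
```

The first pass pairs the two intersection atoms by template; the remaining pair is then associated by the terminal default condition, as in the text.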

MULTIMATCH is a little more complex than
SINGLEMATCH. First we need to decide which
partitions are to be associated before associating
atoms within partitions. Suppose we have two
sets of partitions set_1 and set_2. If both sets
have but one partition each (a common case),
then we expect these to be associated by default
and declare them accordingly. Secondly, if in
some partition of set_1 there is an atom with
predicate p which is known to be analogous to
predicate q, then the partition in set_2 that
contains q should be associated with that which
contains p. Remember that these partitions were
constructed on the basis of semantic templates.
Thus, while several atoms containing a predicate
p may be in a particular partition, there will
be only one partition that contains atoms with
predicate p. Lastly, if in set_1 and set_2 there
is but one partition that contains atoms whose
predicates have the same type, e.g., STRUCTURE,
then we expect these partitions to be associated.
Let MULTIMATCH1 name the function that actually
associates atoms within a partition according to
analogy ana.

multimatch[set_1; set_2; ana] :=

(1) If the terminal default condition for
partitions holds, go to 7.

(2) Let pred[x] = the predicate letter of
atom x. For each partition y, sequence
through each atom x ∈ y. If pred[x] is
on analogy ana, find the partition z ∈
set_2 such that the analog of pred[x]
appears in z. Execute MULTIMATCH1[y; z; ana]
for each such pair y, z.

(3) If the terminal default condition holds,
go to 7. If set_1 and set_2 are empty, go
to 8.

(4) For each partition y ∈ set_1, select the
first atom x. Find a partition z ∈ set_2
such that the type of predicates in z
equals type[x]. If there is only one
such z ∈ set_2, execute MULTIMATCH1[y; z;
ana].

(5) If the terminal default condition holds,
go to 7. If set_1 and set_2 are empty, go
to 8.

(6) If set_1 or set_2 is still not exhausted,
print an error message and halt.

(7) Apply MULTIMATCH1 to the remaining
partitions in set_1 and set_2.

(8) STOP.
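
The partition-pairing part of MULTIMATCH can be sketched as below. This Python sketch only decides which partitions go together and returns the pairs that would be handed to MULTIMATCH1; it does not call MULTIMATCH1 itself. The `PRED_TYPE` table is an assumption, the `subset` predicate is invented for the demonstration, and the terminal default test is applied once at the end rather than between every step.

```python
# Hypothetical predicate types; "subset" is an invented relation for the demo.
PRED_TYPE = {"group": "structure", "simplegroup": "structure",
             "ring": "structure", "simplering": "structure",
             "subset": "relation"}


def pair_partitions(set1, set2, pred_map):
    """Pair the partitions of set1 with those of set2 (each a list of atoms)."""
    left, right, pairs = list(set1), list(set2), []

    def take(y, z):
        pairs.append((y, z))
        left.remove(y)
        right.remove(z)

    # Step 2: some predicate in y already has its analog in z.
    for y in list(left):
        for z in list(right):
            if any(pred_map.get(x[0]) == v[0] for x in y for v in z):
                take(y, z)
                break

    # Step 4: exactly one partition of set2 shares the predicate type of y.
    for y in list(left):
        same = [z for z in right if PRED_TYPE[z[0][0]] == PRED_TYPE[y[0][0]]]
        if len(same) == 1:
            take(y, same[0])

    # Steps 1, 3, 5 and 7: terminal default condition.
    if len(left) == 1 and len(right) == 1:
        take(left[0], right[0])
    if left or right:                          # step 6
        raise ValueError("unmatched partitions remain")
    return pairs


groups = [("group", ("g", "*1")), ("simplegroup", ("x", "*1"))]
rings = [("ring", ("r", "*2", "+2")), ("simplering", ("y", "*2", "+2"))]
subs1 = [("subset", ("a", "b")), ("subset", ("b", "c"))]
subs2 = [("subset", ("u", "v")), ("subset", ("v", "w"))]
pairs = pair_partitions([groups, subs1], [rings, subs2], {"subset": "subset"})
```

In the demonstration the invented `subset` partitions are paired through the known predicate analog, after which the structure-typed partitions are paired by type.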

Each set of atoms in a partition has the same
semantic template. This property defines a
partition. Thus, at the level of abstraction
provided by the templates, all of these atoms are
alike and any differences need to be discriminated
by other criteria. Let us consider an example to
motivate the design of MULTIMATCH1. The theorem
pair T_3 - T_4 can be written as:

T_3: ∀(g; m; x; *_1) group[g; *_1] ∧
propernormal[m; g; *_1] ∧ factorstructure[x; g; m]
∧ simplegroup[x; *_1] ⊃ maximalgroup[m; g; *_1]

T_4: ∀(r; n; y; *_2; +_2) ring[r; *_2; +_2] ∧
properideal[n; r; *_2; +_2] ∧ factorstructure[y; r; n]
∧ simplering[y; *_2; +_2] ⊃ maximalring[n; r; *_2; +_2]

First ZORBA-I associates:

maximalgroup ↔ maximalring

m ↔ n

g ↔ r

*_1 ↔ (*_2, +_2)

when it decomposes T_3 - T_4 into subwffs
distinguished by the syntax of the implication sign.
Later an application of SINGLEMATCH adds:

propernormal ↔ properideal

factorstructure ↔ factorstructure

x ↔ y

MULTIMATCH is passed one partition from each wff.
T_3 contributes

{group[g; *_1], simplegroup[x; *_1]}

and T_4 contributes

{ring[r; *_2; +_2], simplering[y; *_2; +_2]}.

If we apply the MULTIMATCH algorithm just
described to each of these partitions, we find:

Step 1. We do not satisfy the terminal
default condition.

Step 2. None of the predicates that appear
in these partitions appear on the
current analogy. We gather no new
information here.

Step 3. We still do not satisfy the terminal
default condition.

Step 4. We want to use MULTIMATCH1 to
associate the atoms in these partitions.

Of these two partitions, the former pair have the
template structure[set; operator] and the latter
pair have structure[set; operator; operator].
Fortunately, our analogy has variable mapping
information that is quite relevant here. We know
that:

g ↔ r

x ↔ y

We can assume that if some variable appears in
only one atom in a partition, the analogous atom
is one that contains its analog variable, if it
too appears in only one atom. For example, the
variable "g" appears only in group[g; *_1], and its
analog "r" appears only in ring[r; *_2; +_2]. So, we
deduce:

group[g; *_1] ↔ ring[r; *_2; +_2]

A similar argument based upon

x ↔ y

leads us to deduce:

simplegroup[x; *_1] ↔ simplering[y; *_2; +_2]

although we could have also deduced this last
association by our terminal default condition.
Notice that "*_1" is not a discriminating variable
since it appears in both group[g; *_1] and
simplegroup[x; *_1]. After each atom pair is associated,
we apply ATOMMATCH to it to deduce more variable
associations and update our analogy.


The preceding description of MULTIMATCH1 can
be simplified and generalized by realizing that
we are just using a specialized submap of the
developing analogy to extend it further. This
special submap is just that mapping of variables
where each variable appears in only one atom of
the partition. In the preceding example, the
submap was just:

g ↔ r

x ↔ y

multimatch1[partition_1; partition_2; ana] :=

(1) Set l_1 to a list of variables that appear
in only one atom of partition_1.

(2) Set l_2 to a similar list computed on
partition_2.

(3) Set anaprs = {x' ↔ y' | x' ∈ l_1, y' ∈ l_2,
and y' is the analog of x' by ana}.

(4) Execute tempsift[partition_1; partition_2;
testfn_4; ana], where testfn_4[u; v] is
true iff for some variable pair x' ↔ y' ∈
anaprs variable x' appears in atom u and
variable y' appears in atom v.

(5) STOP.
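
The MULTIMATCH1 steps can be sketched on the group/ring partitions discussed above. This is an illustrative Python version under stated assumptions: atoms are `(predicate, variables)` tuples, the variable part of the analogy is a plain dictionary named `var_map`, and the function returns the atom pairs instead of calling ATOMMATCH on them.

```python
from collections import Counter


def discriminating_vars(partition):
    """Variables that occur in exactly one atom of the partition."""
    counts = Counter(v for _pred, args in partition for v in set(args))
    return {v for v, n in counts.items() if n == 1}


def multimatch1(partition1, partition2, var_map):
    l1 = discriminating_vars(partition1)                    # step (1)
    l2 = discriminating_vars(partition2)                    # step (2)
    anaprs = {(x, y) for x, y in var_map.items()            # step (3)
              if x in l1 and y in l2}
    pairs, remaining = [], list(partition2)
    for u in partition1:                                    # step (4)
        for v in remaining:
            if any(x in u[1] and y in v[1] for x, y in anaprs):
                pairs.append((u, v))
                remaining.remove(v)
                break
    return pairs


p1 = [("group", ("g", "*1")), ("simplegroup", ("x", "*1"))]
p2 = [("ring", ("r", "*2", "+2")), ("simplering", ("y", "*2", "+2"))]
atom_pairs = multimatch1(p1, p2, {"m": "n", "g": "r", "x": "y"})
```

The operator variables occur in both atoms of their partition, so only g, x, r and y discriminate; the pairing of group with ring and of simplegroup with simplering follows from them.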


INITIAL-MAP has been completely described.
At this point we have sufficient machinery to
generate a mapping between the predicates and
variables that appear in the statements of theorem
pairs such as T_1 - T_2 and T_3 - T_4. Next we want
to extend this mapping to include all the
predicates that appeared in the proof of the proved
theorem T and are likely to appear in the proof
of the new theorem T_A. In addition, we would like
to pick up a small set of axioms adequate for
proving T_A. EXTENDER performs both functions.


A  DETAILED  DESCRIPTION  OF  EXTENDER 


In the last section I described INITIAL-MAP
in substantial detail. In comparison, EXTENDER
is a far more complex and subtle system which I
will explicate here less completely. I intend to
accomplish several simple aims with this limited
exposition:

(1) Expose the reader to the motivation and
rationale underlying the EXTENDER design.

(2) Convey some appreciation for the flavor
of some well-specified computational
algorithms for creating an analogy.

(3) Provide an intelligible, self-contained,
introductory account of EXTENDER adequate
for the general reader, and motivate the
more sophisticated specialist to consult
a more complete exposition. (6)

The rationale of EXTENDER depends upon a few
simple related ideas. I will begin by explicating
these, then develop MAPDESCR (the clause
description mapping operation), and conclude with a
discussion of two simple versions of EXTENDER.

In the last section I suggested that our
complete analogy could be seen as the last map G_n
in a series G_j of increasingly more complete
analogies. Although we may be developing several
such series in parallel, they all begin with the
same G_1, the analogy produced by INITIAL-MAP.
Each G_j maps some subset of the predicates that
appear in the proof of theorem T. Each distinct
subset will, in general, lead to a different
partition of AXSET into {ALL, SOME, NONE}. When we
search for the analog of an axiom (clause), we
will look for some clause that satisfies the
analog of its description under the current
analogy. Each clause has a unique description,
descr[c], which has been introduced in a previous
section. We will denote the analog of descr[c]
by some analogy G_j as G_j[descr[c]]. G_j[descr[c]]
is equal to a copy of descr[c] in which every
predicate that appears in G_j is replaced by its
analogous predicate. Predicates that are absent
from G_j are left untouched. For example, suppose
we have a trivial G_1:

G_1: abelian ↔ commutativering

c_7: ¬abelian[x; *] ∨ group[x; *]

d_7: neg[abelian], pos[group] = descr[c_7]

G_1[d_7] = neg[commutativering], pos[group].

Suppose we are seeking to extend G_1 by finding
the analog of c_7. It is quite unlikely that we
will find a clause that satisfies this description
(G_1[d_7]), since it would be derived from
some (rare) theorem that relates a condition on
commutative rings to a group structure. In any
event, it would not be an analog of c_7. If we
sought all the clauses that satisfied
neg[commutativering], we would be sure to include c_8 and
c_9, which at least include c_8, the clause we
desire.

c_8: ¬commutativering[x; *; +] ∨ ring[x; *; +]

c_9: ¬commutativering[x; *; +] ∨ commutative[*; x].

Thus, sometimes we want to search for clauses that
satisfy descriptions with features, e.g.,
neg[commutativering], that contain only predicates
that appear on a particular analogy G_j. Now,
what we are doing is a four-step process:


(1) Make a description d for an axiom
clause c, descr[c].

(2) Create an analog description G_j[descr[c]]
for the current analogy, G_j.

(3) Delete from G_j[descr[c]] any feature that
contains a predicate that does not appear
in G_j. Denote this restriction of
G_j[descr[c]] to G_j by Ḡ_j[descr[c]].

(4) Search the data base for clauses that
satisfy Ḡ_j[descr[c]].

In our example, Ḡ_1[descr[c_7]] = Ḡ_1[d_7] =
neg[commutativering]. Ḡ_j[descr[c]] is a "restriction
of the analog of the description of c to
analogy G_j^p." Since this phrase is quite cumbersome,
we will simply call it a "restricted description"
and implicitly understand its dependence on
G_j^p.
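
The four-step process can be sketched on the c_7, c_8, c_9 example. This Python illustration rests on several assumptions: a clause is a list of `(sign, predicate, variables)` literals, a description contains only pos/neg features, a clause "satisfies" a description when its own description contains every feature of it, and the restriction keeps the features whose original predicate is mapped by the analogy.

```python
def descr(clause):
    """Description of a clause: one (sign, predicate) feature per literal."""
    return {(sign, pred) for sign, pred, _args in clause}


def analog_descr(description, pred_map):
    """Analog description: replace mapped predicates, leave the rest untouched."""
    return {(sign, pred_map.get(pred, pred)) for sign, pred in description}


def restricted_descr(description, pred_map):
    """Restricted description: keep only features whose predicate is mapped."""
    return {(sign, pred_map[pred])
            for sign, pred in description if pred in pred_map}


def satisfies(clause, description):
    """True when every feature of the description occurs in the clause."""
    return description.issubset(descr(clause))


G1 = {"abelian": "commutativering"}
c7 = [("neg", "abelian", ("x", "*")), ("pos", "group", ("x", "*"))]
c8 = [("neg", "commutativering", ("x", "*", "+")), ("pos", "ring", ("x", "*", "+"))]
c9 = [("neg", "commutativering", ("x", "*", "+")), ("pos", "commutative", ("*", "x"))]

full = analog_descr(descr(c7), G1)
restricted = restricted_descr(descr(c7), G1)
candidates = [c for c in (c8, c9) if satisfies(c, restricted)]
```

The complete analog description still carries pos[group] and so rejects c_8, while the restricted description admits both c_8 and c_9.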

At different times EXTENDER may seek clauses
that satisfy a complete analogous description
G_j[descr] or just a restricted one Ḡ_j[descr]. In
summary, EXTENDER relies upon four key notions:

(1) An ordered sequence of partial analogies
G_j.

(2) A partition of the axioms used in
proof[T] (AXSET) into three disjoint sets:
ALL, SOME, and NONE.

(3) A search for clauses that satisfy the
analogs of the description of the clauses
in proof[T].

(4) A restriction of our descriptions
relative to an analogy G_j by including only
those features with predicates that
appear in G_j.
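
The partition of AXSET named in notion (2) can be sketched as follows. The sketch assumes, from the way the three sets are used in the text, that an axiom belongs to ALL, SOME, or NONE according to whether all, some, or none of its predicates are mapped by the current analogy. The axioms `a1` and `a2` and the predicate `subgroup` are invented for the demonstration.

```python
def partition_axset(axset, pred_map):
    """Split axioms by how many of their predicates the analogy already maps."""
    all_, some, none = [], [], []
    for name, clause in axset.items():
        preds = {pred for _sign, pred in clause}
        known = {p for p in preds if p in pred_map}
        if known == preds:
            all_.append(name)
        elif known:
            some.append(name)
        else:
            none.append(name)
    return all_, some, none


G1 = {"abelian": "commutativering", "intersection": "intersection"}
axset = {
    "c7": [("neg", "abelian"), ("pos", "group")],
    "a1": [("pos", "intersection")],                  # hypothetical axiom
    "a2": [("neg", "subgroup"), ("pos", "group")],    # hypothetical axiom
}
ALL, SOME, NONE = partition_axset(axset, G1)
```

Once `group` is added to the predicate map, `c7` would move to ALL and `a2` to SOME, which is the movement EXTENDER tries to bring about.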

INITIAL-MAP used an operation called
ATOMMATCH in a rather clever way to extend its
current analogy. Likewise, EXTENDER uses an
operation called MAPDESCR for a similar purpose.
Both operations use abstract descriptions in order
to associate their data: ATOMMATCH uses the
semantic template associated with a predicate, and
MAPDESCR uses the description of the clauses it
is associating. EXTENDER and INITIAL-MAP differ
in that EXTENDER generates a new partial analogy
each time it activates MAPDESCR (and the resultant
mapping is new) while INITIAL-MAP uses ATOMMATCH
to expand one growing analogy.

Each partial analogy G_j is derived from its
antecedent G_{j-1} by adding

(1) An association of one clause ax_k ∈ SOME
with one or more clauses from the data
base.

(2) An association of the predicates in
those clauses.

A simple example will illustrate this amply. If
G_1 is the initial analogy generated by
INITIAL-MAP applied to the pair of theorems T_1 - T_2, its
predicate map is

abelian ↔ commutativering
intersection ↔ intersection.

Suppose we know that c_7 ↔ c_8. We would like to
extend G_1 to G_2 by adding:

(1) c_7 ↔ c_8

(2) abelian ↔ commutativering

group ↔ ring.


To motivate the structure of MAPDESCR, let
us design a version of it that would enable us
to extend G_1 to G_2 in this example. MAPDESCR is
charged with mapping neg[abelian], pos[group]
(d_7) with neg[commutativering], pos[ring], when
it knows that:

G_1: abelian ↔ commutativering
intersection ↔ intersection.

First, we can eliminate neg[abelian] from d_7 and
neg[commutativering] from d_8 on the basis of G_1,
which associates "abelian" and "commutativering."

G_1[neg[abelian]] = neg[commutativering].

Now we are simply left with associating pos[group]
and pos[ring]. Since these are the only two
elements left, have the same semantic type
(STRUCTURE), and have the same feature (pos), we
can map them by default and add

group ↔ ring

to G_2.
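
The reasoning of this example (discard the features already explained by G_1, then pair what is left when feature and semantic type agree) can be sketched in Python. The `PRED_TYPE` table and the set-of-tuples representation of descriptions are assumptions. The sketch also lets a feature found verbatim in both descriptions map its predicate to itself, as MAPDESCR does, and it pairs leftover features only when exactly one candidate agrees in feature and type.

```python
# Hypothetical semantic types of predicates.
PRED_TYPE = {"abelian": "structure", "commutativering": "structure",
             "group": "structure", "ring": "structure"}


def mapdescr(descr1, descr2, pred_map):
    """Return the new predicate pairs obtained by matching two descriptions."""
    d1, d2 = set(descr1), set(descr2)
    pairs = {}

    # Drop features of descr1 whose analog under the analogy occurs in descr2.
    for sign, pred in sorted(d1):
        image = (sign, pred_map.get(pred))
        if image in d2:
            d1.discard((sign, pred))
            d2.discard(image)

    # Features common to both descriptions map their predicate to itself.
    for sign, pred in sorted(d1 & d2):
        pairs[pred] = pred
        d1.discard((sign, pred))
        d2.discard((sign, pred))

    # Pair leftovers that agree on feature and semantic type, if unambiguous.
    for sign, pred in sorted(d1):
        candidates = [q for s, q in sorted(d2)
                      if s == sign and PRED_TYPE[q] == PRED_TYPE[pred]]
        if len(candidates) == 1:
            pairs[pred] = candidates[0]
            d2.discard((sign, candidates[0]))
    return pairs


G1 = {"abelian": "commutativering", "intersection": "intersection"}
d7 = {("neg", "abelian"), ("pos", "group")}
d8 = {("neg", "commutativering"), ("pos", "ring")}
new_pairs = mapdescr(d7, d8, G1)
G2 = {**G1, **new_pairs}      # predicate map of the extended analogy
```

Taking the union of the old predicate map with the returned pairs gives the predicate map of the extended analogy.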


Now, we can write a version of MAPDESCR
which accepts as arguments two clause descriptions
and an analogy G_j:

mapdescr[descr_1; descr_2; G_j] :=

(1) ∀x, x ∈ descr_1 s.t. G_j[x] ∈ descr_2,
delete x from descr_1 and G_j[x] from
descr_2. Thus, we exclude all those
features we know about from G_j.

(2) ∀x, x ∈ descr_1 and x ∈ descr_2, map the
predicate that appears in x into itself
and delete x from descr_1 and descr_2.

(3) In the remnants of descr_1 and descr_2:

(a) If there are unique elements of
descr_1 and descr_2 that have the
same feature, e.g., pos, and
semantically compatible predicates,
associate those terms and delete
them from the remnant descriptions.
Here "semantic compatibility" means
"same semantic type."

(b) If more than one element of descr_1
and descr_2 have the same feature,
e.g., pos, then discriminate within
these elements on the basis of the
semantic types of their predicates.

(4) Return the resultant list of paired
predicates.

Most often in my algebra data base a clause
description consists of two, three, or four features.
EXTENDER ensures that some of the predicates in
any pair of clauses passed on to MAPDESCR are on
G_j. Thus, by the time we reach step 3 of the
MAPDESCR algorithm we often have descriptions of
length one, which map trivially by default, or
descriptions of length two with different features,
e.g., pos and neg. Thus, step 3b, which requires
disambiguation based upon predicate types, occurs
rarely in this domain (abstract algebra).

When MAPDESCR returns a list of predicate
pairs that result from mapping the description of
a clause c_1 (descr_1, above) with the description
of a clause c_2 (descr_2, above) according to analogy
G_j, it creates a new analogy G_{j+1}. G_{j+1} is the
same as G_j except that

(1) Its predicate map is the union of the
one returned by MAPDESCR and the one
appearing on G_j.

(2) Its clause mapping is the union of the
one appearing on G_j and c_1 ↔ c_2.

Thus, when EXTENDER is attempting to extend
G_j, it creates a new analogy G_{j+1}, G_{j+2}, etc. for
each clause pair it maps when those clauses were
selected on the basis of information in G_j. Of
course, there is a procedure to see whether the
predicate associations of a new analogy have
appeared in some previously generated analogy and
thus prevent the creation of redundant analogies.
In this case the two corresponding clauses are
added to each existing analogy for which the
predicate pairs returned by MAPDESCR are a subset
of its clause map.


After I explicate one additional idea I can
describe a simple version of EXTENDER. When
EXTENDER is extending Gj it is searching the
large data base for some clause that is the analog
of an axiom ck ∈ SOME. Now we could search for
the set of clauses that satisfy the analog of the
full description descr[ck], but we will run into
the difficulty described earlier in this section.
Thus, we search for clauses that satisfy
Gj[descr[ck]], the analog of the description
restricted to the predicates that appear on Gj.
If Gj contains the correct analog for each
predicate that appears on it, then the set of
clauses C that satisfy Gj[descr[ck]] is guaranteed
to contain the desired analog of ck (the "image"
of ck). We will refer to C as the "candidate
image set." Suppose that C has but one member, c'.
Then we know that c' is the analog (image) of ck
and should extend Gj to Gj+1 by associating

ck ↔ c'


When the set of clauses that satisfies a
restricted description contains only one clause,
we are guaranteed that it is the image clause we
seek, provided that the predicate map of Gj does
not contain any erroneous associations. Now, if C
is empty, we have reason to suspect the
correctness of Gj and we ought to stop developing
this branch of the analogy search space. On the
other hand, if C has more than one member, and Gj
is correct, we know that our desired image is in
C. If we have a clause c with description descr[c]
and some analogy Gj that contains only one of the
predicates in c, then Gj[descr[c]] will have but
one feature and many clauses will satisfy it. If
some later analogy Gk (Gj ⊂ Gk) includes another
predicate from c in addition to the one on Gj,
then Gk[descr[c]] will have two features and will
be satisfied by fewer clauses than Gj[descr[c]].

Thus, as a sequence of analogies evolves, each
clause will have progressively fewer candidate
images that satisfy its restricted description.
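
The narrowing of the candidate image set can be illustrated with a small Python sketch. It assumes, as a simplification, that a description is a set of (feature, predicate) pairs and that a clause "satisfies" a restricted description when its own description contains every feature of it. The function names and the toy data base are hypothetical.

```python
def restricted_analog(description, predicate_map):
    """Map only the features whose predicate appears on the analogy."""
    return {
        (feature, predicate_map[pred])
        for feature, pred in description
        if pred in predicate_map
    }


def candidate_images(description, predicate_map, database):
    """Names of data base clauses whose descriptions contain the restricted analog."""
    target = restricted_analog(description, predicate_map)
    return {name for name, descr in database.items() if target <= descr}


# Toy data (hypothetical clause names and descriptions).
descr_c = {("neg", "group"), ("neg", "subset"), ("pos", "subgroup")}
database = {
    "a": {("neg", "ideal"), ("pos", "subring")},
    "b": {("neg", "ring"), ("neg", "subset"), ("pos", "subring")},
    "d": {("neg", "ring"), ("pos", "ideal")},
}
gj = {"subgroup": "subring"}                   # one predicate of c is mapped
gk = {"subgroup": "subring", "group": "ring"}  # a later analogy maps two

print(candidate_images(descr_c, gj, database))
print(candidate_images(descr_c, gk, database))
```

Under gj the restricted description has one feature and two clauses satisfy it; under the larger gk it has two features and only one clause remains.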


To search for the clauses that satisfy the
analog of a restricted (short) description,
EXTENDER invokes an operator shortdescr[Gj].
SHORTDESCR is dependent on Gj in three ways:

(1) It searches for the analogs of clauses
that appear on SOME (which is different
for each Gj).

(2) It generates descriptions that include
only the predicates that appear
explicitly in Gj.

(3) It uses the predicate map of Gj.

SHORTDESCR returns a (possibly empty) list of
axioms (from SOME), each of which is paired with
a set of clauses from the data base which satisfy
the analog of its restricted description. Each
axiom is guaranteed to have its analog under Gj
in its associated "candidate image set." If we
find no candidates at all, for any axk ∈ SOME,
then we know that Gj contains some wrong
predicate associations, and we ought to mark it as
"infertile" and discontinue attempting to extend
it. Of the images we find, we prefer those
axiom-candidate associations with but one
candidate image. If we apply MAPDESCR to each such
pair, we can be sure that we have a consistent
extension of Gj. Let us consider a primitive
version of EXTENDER, EXTEND1, which exploits
these few ideas.

EXTEND1 [Gj; AXLIST] :=

(1) Let analist = (Gj), the set of active
analogies.

(2) If Gj is complete, STOP.

(3) Partition AXLIST into {ALL, SOME, NONE}
relative to Gj.

(4) Set imlist to shortdescr[Gj]. If
imlist = ∅, mark Gj as BARREN and go to 7.

(5) Set unimages to the subset of imlist that
has only one candidate analog for each
axiom. If unimages = ∅, go to 7.

(6) Apply MAPDESCR to each axiom and its
analog that appears on unimages. If
MAPDESCR adds a new analogy, add it to
the end of analist.

(7) If analist is empty, STOP. Otherwise,
set Gj to the next element on analist.
Go to 2.
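
The control flow of these seven steps can be sketched in Python as follows. This is a minimal sketch, not the implemented LISP program: shortdescr, mapdescr, and is_complete are passed in as hypothetical callables, shortdescr is assumed to perform the partition of step 3 internally and to return a dictionary from axioms to candidate sets, and mapdescr is assumed to return the extended analogy or None.

```python
def extend1(g0, axlist, shortdescr, mapdescr, is_complete):
    """Return (complete analogy or None, list of analogies marked barren)."""
    analist = [g0]                       # step 1: active analogies
    barren = []
    position = 0
    while position < len(analist):       # step 7: stop when analist is exhausted
        gj = analist[position]
        position += 1
        if is_complete(gj):              # step 2
            return gj, barren
        imlist = shortdescr(gj, axlist)  # steps 3-4: axiom -> candidate images
        if not imlist:
            barren.append(gj)            # no candidates at all: give up on gj
            continue
        # Step 5: keep only the axioms with exactly one candidate image.
        unimages = [(ax, next(iter(c))) for ax, c in imlist.items() if len(c) == 1]
        for ax, image in unimages:       # step 6
            new = mapdescr(gj, ax, image)
            if new is not None and new not in analist:
                analist.append(new)
    return None, barren
```

Because the loop simply moves on when no axiom has a unique image, the sketch shares the weakness of the tutorial algorithm: it can exhaust analist without ever extending Gj.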

The success of EXTEND1 is highly dependent
upon the clauses in the data base. If there are
few clauses then it is likely that some axk ∈
SOME will have but one image under SHORTDESCR at
each iteration and that EXTEND1 will be successful.
As the data base increases in size with ever more
clauses involving predicates that will appear in
proof[TA], then it becomes more likely for
SHORTDESCR to generate several images for every
axk ∈ SOME in some iteration. At this point it
will fail to extend Gj and miss the analogy
altogether. To remedy this situation, we need a
way of dealing with cases when SHORTDESCR returns
several candidate images for each axk ∈ SOME. We
need some way to select the clause from the
candidate set that is most likely to be the analog
we seek. When EXTENDER meets a situation of this
sort, it orders all the images according to their
likelihood of being analogous to the axk ∈ AXSET
with which they are paired. I will introduce one
such ordering relation by a simple example.

Consider, for example, the clause c10 and an
analogy G2 that includes

intersection ↔ intersection
subgroup ↔ subring
abeliangroup ↔ commutativering

c10: subgroup[x; y; *] ∨ ¬group[x; *] ∨
     ¬group[y; *] ∨ ¬subset[x; y]

d10 = neg[group], neg[subset], pos[subgroup]

G2[d10] = pos[subring].

Suppose our data base contains two clauses c11
and c12 that satisfy G2[d10]:

c11: subring[m; r; *; +] ∨ ¬ideal[m; r; *; +]

d11 = neg[ideal], pos[subring]

c12: subring[x; a; *; +] ∨ ¬ring[a; *; +]
     ∨ ¬ring[x; *; +] ∨ ¬subset[x; a]

d12 = neg[ring], neg[subset], pos[subring].

We can compare c11 and c12 by comparing d11 and
d12 with d10 (relative to G2). We want a partial
ordering of a set of descriptions relative to a
target description and a particular analogy, e.g.,
φd[d1; d2; d; Gj], that orders description d1
with respect to d2. A simple φd can be developed
as follows:

Let d1' = d1 - Gj[d]
    d2' = d2 - Gj[d]
    d'  = d  - Gj[d]

that is, remove from each description the features
already accounted for by Gj. For d1' and d2'
compute the number of features, e.g., pos, in
common with d'. The description with the most
features in common is closest to d.


In our example, we have

d'10 = neg[group], neg[subset]

d'11 = neg[ideal]

d'12 = neg[ring], neg[subset].

Clearly d'12 is closer to d'10 than d'11, so we
select d'12 as our closest description and c12 as
the image of c10 under G2. After MAPDESCR maps
c10 and c12, it will add:

group ↔ ring
subset ↔ subset

to G2 to create an analogy G3:

G3: intersection ↔ intersection
    subgroup ↔ subring
    group ↔ ring
    subset ↔ subset.
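
The simple ordering used in this example can be sketched in Python. The sketch rests on assumptions: a description is a set of (feature, predicate) pairs, the features "already accounted for" are those whose predicate lies in the domain of the analogy (for the target) or in its range (for a candidate), and "features in common" is counted as the overlap of the pos/neg multisets. The function names are hypothetical.

```python
from collections import Counter


def remainder(description, accounted_predicates):
    """Drop the features whose predicate is already handled by the analogy."""
    return [(f, p) for f, p in description if p not in accounted_predicates]


def closeness(candidate, target, predicate_map):
    """Number of features (pos/neg) the two remainders have in common."""
    target_rest = remainder(target, set(predicate_map.keys()))
    candidate_rest = remainder(candidate, set(predicate_map.values()))
    t = Counter(f for f, _ in target_rest)
    c = Counter(f for f, _ in candidate_rest)
    return sum((t & c).values())  # multiset intersection


g2 = {"intersection": "intersection", "subgroup": "subring",
      "abeliangroup": "commutativering"}
d10 = {("neg", "group"), ("neg", "subset"), ("pos", "subgroup")}
d11 = {("neg", "ideal"), ("pos", "subring")}
d12 = {("neg", "ring"), ("neg", "subset"), ("pos", "subring")}

print(closeness(d11, d10, g2), closeness(d12, d10, g2))
```

With the data of the example, d12 shares two neg features with the remainder of d10 and d11 shares only one, so c12 is preferred.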


A more sophisticated φd can look at the semantic
types of predicates that share common features if
two descriptions are equivalent under the simple
φd described above. EXTENDER uses an operator
called MULTIMAP to select the best image (using
φd) for a clause that has several candidate
images with a restricted description under Gj.
Exploiting this notion, we can write a more
powerful EXTENDER called EXTEND2.


EXTEND2 [Gj; AXSET] :=

(1) Let analist be the list of active
analogies. Start with analist = (Gj).

(2) If Gj is complete, STOP.

(3) Partition AXSET into {ALL, SOME, NONE}
relative to Gj.

(4) Set imlist to shortdescr[Gj]. If
imlist = ∅, mark Gj as "infertile" and
go to 8.

(5) Set unimages to the subset of imlist
that has only one candidate analog for
each axiom. If unimages = ∅, go to 7.

(6) Apply MAPDESCR to each axiom and its
analog that appears on unimages. If
MAPDESCR adds a new analogy, add it to
the end of analist. Go to 8.

(7) Apply MULTIMAP to imlist to select an
optimal candidate image under φd for
each axiom. Set unimages to this list
of axioms paired with best candidates.
Go to 6.

(8) If analist is empty, STOP. Otherwise,
set Gj to the next element on analist.
Go to 2.
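
What distinguishes EXTEND2 from EXTEND1 is the fallback in steps 5 and 7. A minimal Python sketch of that selection follows, assuming imlist is a dictionary from axioms to nonempty candidate sets and that score is some hypothetical closeness function standing in for φd. The name select_images is an illustration, not part of ZORBA-I.

```python
def select_images(imlist, score):
    """Pair axioms with one image each.

    Step 5: prefer axioms that have exactly one candidate image.
    Step 7: if no axiom has a unique candidate, pick the best-scoring
    candidate for every axiom (ties broken by sorted order).
    """
    unique = {ax: next(iter(c)) for ax, c in imlist.items() if len(c) == 1}
    if unique:
        return unique
    return {
        ax: max(sorted(c), key=lambda cand: score(ax, cand))
        for ax, c in imlist.items()
    }


# Usage with hypothetical scores: c12 is closer to c10 than c11 is.
scores = {("c10", "c11"): 1, ("c10", "c12"): 2}
best = select_images({"c10": {"c11", "c12"}}, lambda ax, c: scores[(ax, c)])
print(best)
```

The pairs returned here would then be handed to MAPDESCR exactly as in step 6.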

This version of EXTENDER is quite powerful
and will handle a wide variety of theorem pairs.
The reader who is interested in the behavior of
EXTENDER in generating the sequence Gj is referred
to a more detailed report (6) for case studies and
further explication. The implemented versions of
EXTENDER are far more complex than these
simplified tutorial versions. They (1) allow
backup, (2) have operations for combining a set of
partial analogies into a "larger" analogy
consistent with all of them, (3) have a
sophisticated evaluation for deciding which
particular axiom-candidate set to pass to MULTIMAP
(in lieu of step 7 above), and (4) can often
localize which predicate associations are
contributing to an infertile analogy when one is
generated. Table 2B contains a brief summary of
ZORBA-I's behavior when it is applied to five
T-TA pairs drawn from abstract algebra.

The number of partial analogies generated
includes the one generated by INITIAL-MAP.

Table  2A 

THEOREMS  REFERENCED  IN  TABLE  2B 

T1. The intersection of two abelian groups
is  an  abelian  group. 

T2. The intersection of two commutative
rings is a commutative ring.

T3.  A  factor  group  G/H  is  simple  iff  H  is  a 
maximal  normal  subgroup  of  G. 

T4.  A  quotient  ring  A/C  is  simple  iff  C  is  a 
maximal  ideal  in  A. 

T5.  The  intersection  of  two  normal  groups  is 
a  normal  group. 

T6. The intersection of two ideals is an
ideal. 

T7.  The  homomorphic  image  of  a  subgroup  is 
a  subgroup. 

T8.  The  homomorphic  image  of  a  subring  is  a 
subring. 

T9.  The  homomorphic  image  of  an  abelian 
group  is  an  abelian  group. 

T10.  The  homomorphic  image  of  a  commutative 
ring is a commutative ring.

NECESSARY CONDITIONS FOR AN ANALOGY

ZORBA-I has three necessary conditions for
creating an analogy. The first, created by the
form of ATOMMATCH, pertains to the form of the
statements of T and TA.

(1) In the statements of T and TA, atoms
must map one-one from T to TA.

Notice that we do not insist that predicates map
one-one. Consider an INITIAL-MAP between

T1: The intersection of two abelian groups
is an abelian group

and

T5: The intersection of an abelian group
and a commutative ring is an abelian
group.

T1*: abelian[a; *1] ∧ abelian[b; *1] ∧
     intersection[c; a; b] ⇒ abelian[c; *1]

T5*: abelian[x; *2] ∧ cring[y; *2; +2] ∧
     intersection[z; x; y] ⇒ abelian[z; *2]

ATOMMATCH can map

abelian[c; *1] ↔ abelian[z; *2]

and

abelian[b; *1] ↔ cring[y; *2; +2]

at different times and handle many-one predicate
maps. However, the EXTENDER would need to know
(and it does not yet) how to handle this ambiguous
information.


The second restriction is created by the
extension of the analogy by finding image clauses
that satisfy the incrementally improved analogy.
To state this condition on the image clauses in a
formal way, I need to introduce some simple
terminology. Let us say that a clause c bridges a
set of predicates P1 to another set of predicates
P2 iff:

P1 ∪ preds[c] = P2
P1 ∩ preds[c] ≠ ∅

and (redundantly)

P2 ∩ preds[c] ≠ ∅
P1 ≠ P2

Now consider two clauses, c1 and c2. We will say
that c1 and c2 bridge from P1 to P2 if ∃ P' such
that c1 bridges from P1 to P' and c2 bridges from
P' to P2 (so that P1 ⊂ P' ⊂ P2). In general, we
will say that an unordered set C of k clauses
bridges from P1 to P2 iff ∃ P'1, P'2, ..., P'k-1
such that:

(1) ∃ c1 ∈ C and c1 bridges from P1 to P'1.

(2) ∀ j = 2, ..., k-1, ∃ cj ∈ C and
cj bridges from P'j-1 to P'j.

(3) ∃ ck ∈ C and ck bridges from P'k-1 to P2.

Now let:

preds[T] = predicates used in proof of T.

Pr[T] = predicates used in statement of T.

G = analogy from T to TA.

descr[c] = description of clause c.

G[descr[c]] = analog description of the
description of c under G.

AXSET = axioms used in proof of T.

(2) A necessary condition for the EXTENDER
to work is that:

(a) ∃ C ⊂ AXSET and C bridges Pr[T] to
preds[T].

(b) and if G[C] = {c' : c' satisfies
G[descr[c]] for some c ∈ C}, then
G[C] bridges from Pr[TA] to preds[TA].

A  A 

More verbally, some subset of the axioms in
the proof of T that bridge the domain of
INITIAL-MAP to preds[T] has a set of image
clauses under G that bridge the images of
INITIAL-MAP to preds[TA]. Thus, the proofs need
not be isomorphic; it is merely required that some
subset of the axioms have a nearly isomorphic set
of image axioms, similarly restricted to the
bridging condition.
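
The bridging relation lends itself to a direct check. The Python sketch below is one reading of the definition, under stated assumptions: each clause is represented only by its set of predicates, each clause of the set is used exactly once, and the intermediate sets are the running unions, so a set of clauses bridges P1 to P2 when some ordering of the clauses forms a chain of single bridges. The function names are hypothetical.

```python
from itertools import permutations


def bridges(p1, clause_preds, p2):
    """Single-clause bridge: the clause overlaps P1 and extends it exactly to P2."""
    p1, clause_preds, p2 = set(p1), set(clause_preds), set(p2)
    return (p1 | clause_preds) == p2 and bool(p1 & clause_preds) and p1 != p2


def set_bridges(p1, clauses, p2):
    """True if some ordering of the clauses chains from P1 to P2."""
    if not clauses:
        return False
    for order in permutations(clauses):
        current = set(p1)
        chained = True
        for preds in order:
            nxt = current | set(preds)
            if not bridges(current, preds, nxt):
                chained = False
                break
            current = nxt
        if chained and current == set(p2):
            return True
    return False
```

Trying every permutation is exponential in the number of clauses, which is acceptable only for the handful of axioms in a proof; it is used here for clarity rather than efficiency.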

This bridging condition may seem rather
nonintuitive from the vantage point of choosing a
data  base,  but  it  should  be  clear  that  EXTENDER 
imposes  this  condition. 

To develop analogies in domains that are
described by predicate calculus with constants
would require wholly different analysis algorithms.
Consider a robot that is instructed to go from
SRI to (1) an office on its floor, (2) Stanford
University, (3) San Francisco, (4) New York City,
(5) Chicago. These five problems could be stated
to QA3 as

T10. ∃sf at[robot; office5; sf]

T11. ∃sf at[robot; Stanford; sf]

T12. ∃sf at[robot; San Francisco; sf]

T13. ∃sf at[robot; NYC; sf]

T14. ∃sf at[robot; Chicago; sf]

By trivial syntactic matching we could associate
office5 with Chicago, Stanford with San
Francisco, etc. The robot's actions to get from
SRI to Stanford or San Francisco, New York City,
or Chicago are pairwise similar. But the
INITIAL-MAP or EXTENDER would have to know the
"semantics" of these (geographic) constants
(with respect to SRI) and the robot's actions to
assess which problems are adequately analogical
and which action rules should be extrapolated to
the unsolved problem.

RELATIONSHIP  BETWEEN  ZORBA-I  AND  QA3 

In the preceding section, I have discussed
the organization and use of ZORBA-I independently
of QA3. In this section, I merely want to note
how changes in QA3 can affect the way in which the
analogical information output by ZORBA-I can be
used.


The present version of ZORBA-I outputs a set
of clauses that it proposes as a restricted data
base for proving TA. If every clause in proof[T]
has at least one image clause, then simply
modifying the QA3 data base is magnificently
helpful. However, if the analogy is weak and we
have only a partial set of images, what can we do?
If every predicate used in proof[T] has an image,
we could restrict our data base to just those
clauses containing the image predicates. Could we
do better? And what do we do with a partial
analogy in which some clauses and some predicates
have images, but not all of either? At this point
we meet limitations imposed by the design of QA3.
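
The predicate-based restriction mentioned above can be sketched in Python. This is only an illustration: the function name is hypothetical, each data base clause is represented by its set of predicates, and "containing the image predicates" is read as "mentioning at least one image predicate," which is an assumption rather than something the text settles.

```python
def restricted_database(database, proof_predicates, predicate_map):
    """Names of clauses that mention an image predicate.

    Returns None when some predicate of proof[T] has no image, since the
    restriction is only proposed for the case where every predicate has one.
    """
    if not set(proof_predicates) <= set(predicate_map):
        return None
    images = {predicate_map[p] for p in proof_predicates}
    return {name for name, preds in database.items() if images & set(preds)}


# Hypothetical data base: clause name -> predicates it mentions.
database = {
    "a1": {"subring", "ring"},
    "a2": {"field"},
    "a3": {"ring", "subset"},
}
pmap = {"group": "ring", "subgroup": "subring"}
print(restricted_database(database, {"group", "subgroup"}, pmap))
```

A stricter reading would keep only the clauses whose predicates are all image predicates; which reading serves QA3 better is left open here.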

All contemporary theorem provers, including QA3,
use a fairly homogeneous data base. QA3 does give
preference to short clauses, since it is built
around the unit-preference strategy. But it has
no way of focusing primary attention upon a select
subset of axioms A*, and attending to the
remaining axioms in D - A* only when the search is
not progressing well. One can contrive various
devices, such as making the clauses in A*
"pseudo-units" that would be attended to early.
Or, with torch and sword, one could restructure
QA3 around a "graded memory" (7). Basically we
have to face the fact that our contemporary
strategies for theorem proving are designed to be
as optimal as possible in the absence of a priori
problem-dependent information. And these optimal
strategies are difficult to reform to wisely
exploit a priori hints and guides that are problem
dependent. This is not to say that various kinds
of a priori information cannot be added. Rather,
it is a separate and sizable research task to
decide how to do it. I presume, but do not know,
that these comments extrapolate to other
problem-solving procedures, and a system that is
organized around a priori hints, heretofore user
supplied, may look very different from one which
is designed to do its best on its own. QA3 was
chosen because it was available and saved years of
work developing a (new) suitable theorem prover.
However, further research in AR may well benefit
from relating to a more flexible theorem-proving
system.

WHAT'S  NEW? 

What does ZORBA add to our understanding of
AR? What does ZORBA leave unanswered? Pre-ZORBA,
most researchers believed that analogies would
relate to plans and (possibly to probably) include
some sort of semantic information. ZORBA adds the
following insights to our understanding of AR:

(1)  Some  fairly  interesting  AR  can  be  handled 
by  modifying  the  environment  in  which  a 
problem  solver  operates  rather  than 
forcing  the  use  of  a  sequential  planning 
language.

(2) Each problem solver/theorem prover will
use  different  a  priori  information  and 
consequently  will  require  different 
analogy-generation  programs. 

(3)  A  good  analogy  generator  will  output 
some  information  helpful  to  speeding  up 
a  problem  search  as  a  byproduct  of  a 
successfully  generated  analogy. 

(4) Part of the problem of AR is to specify
precisely how the derived analogical
information is to be used by the problem
solver.

(5) An effective, nontrivial analogy
generator can be adequately built that
uses a simple theory and primitive
semantic selection rules.

(6) Although analogies are nonformal and are
semantically oriented, nontrivial
analogies can be handled by a special
system wrapped around a highly formal
theorem prover.

In contrast, ZORBA neglects:

(1) Methods for handling those analogies
that absolutely require a planning level
generalization and sequential
information.

(2) Very weak analogies.

(3) What to do with many rules of inference.

(4) How to describe the "structure of an
analogy."

ZORBA makes a substantial contribution to our
pale understanding of AR, and in the process
helps articulate additional questions that reveal
our vast ignorance of analogical ways of knowing.

REFERENCES 

1.  N.  J.  Nilsson,  Problem  Solving  Methods  in 
Artificial  Intelligence  (McGraw-Hill,  to  be 
published  1971). 

2.  G.  W.  Ernst  and  A.  Newell,  "Some  Issues  of 
Representation  in  a  General  Problem  Solver," 
AFIPS  Conference  Proceedings,  Vol.  30  (1967), 
pp.  583-600. 

3.  R.  E.  Fikes,  "REF-ARF:  A  System  for  Solving 
Problems  Stated  as  Procedures,"  Artificial 
Intelligence,  Vol.  1,  pp.  27-120  (1970). 

4. R. E. Kling, "An Information Processing
Approach to Reasoning by Analogy," Artificial
Intelligence Group TN10, Stanford Research
Institute, Menlo Park, California (June 1969).

5. C. Green, "Theorem Proving by Resolution as a
Basis for Question Answering Systems," in
Machine Intelligence, Vol. 4, D. Michie and
B. Meltzer, eds. (Edinburgh Univ. Press,
Edinburgh, Scotland, 1969).

6. R. E. Kling, "Reasoning by Analogy with
Applications to Heuristic Problem Solving: A
Case Study," Stanford University Ph.D. Thesis,
forthcoming.

7. R. E. Kling, "Design Implications of Theorem
Proving Strategies," AI Group Technical
Note 44, Stanford Research Institute, Menlo
Park, California (1970).


ACKNOWLEDGMENT 

The  research  reported  herein  was  sponsored  by 
the  Advanced  Research  Projects  Agency  and  the 
National  Aeronautics  and  Space  Administration  under 
Contract  NAS12-2221 . 




